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Abstract. Associative network models featuring multi-tasking properties have been introduced recently 
and studied in the low load regime, where the number P of simultaneously retrievable patterns scales with 
the number of nodes as P ~ logA^. In addition to their relevance in artificial intelligence, these models 
are increasingly important in immunology, where stored patterns represent strategies to fight pathogens and 
nodes represent lymphocyte clones. They allow us to understand the crucial ability of the immune system 
to respond simultaneously to multiple distinct antigen invasions. Here we develop further the statistical 
mechanical analysis of such systems, by studying the medium load regime, P ~ with S G (0,1]. We 
derive three main results. First, we reveal the nontrivial architecture of these networks: they exhibit a high 
degree of modularity and clustering, which is linked to their retrieval abilities. Second, by solving the model 
we demonstrate for 6 < 1 the existence of large regions in the phase diagram where the network can retrieve 
all stored patterns simultaneously. Finally, in the high load regime 6 = 1 we find that the system behaves 
as a spin glass, suggesting that finite-connectivity frameworks are required to achieve effective retrieval. 
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1. Introduction 

After a pioneering paper [Ij followed by a long period of dormancy, recent years have witnessed a surge of 
interest in statistical mechanical models of the immune system [2j|3l|4l|5l|6l[71[8l[9l [10]. This description 
complements the more standard approaches, which tend to be phrased in the language of dynamical systems 
pn [T2l [T3l [T4) . To make further progress, however, it has become clear that we need new quantitative tools, 
able to handle the complexities which surfaced in e.g. [151 [16]. This is the motivation for the present study. 

There is an intriguing and fruitful analogy (from a modelling perspective) between neural networks, 
which have been modelled in statistical mechanics quite extensively, and immune networks. Let us highlight 
the similarities and differences. In neural networks the nodes represent neurons, which interact with each 
other directly through Hebbian synaptic couplings. In (adaptive) immune systems, effector branches (B- 
clones) and coordinator branches (helper and suppressor T-clones), interact via signaling proteins called 
cytokines. The latter can represent both eliciting and suppressive signals. Neural and immune systems are 
both able to learn (e.g. how to fight new antigens), memorize (e.g. previously encountered antigens) and 
'think' (e.g. select the best strategy to cope with pathogens). However, neural networks are designed for 
serial processing: neurons perform collectively to retrieve a single pattern at a time. This is not acceptable 
in the immune context. Multiple antigens will normally be present at the same time, which requires the 
simultaneous recall of multiple patterns (i.e. of multiple defense strategies). Moreover, the architectures 
of neural and immune networks are very different. A model with fully connected topology, mathematically 
convenient but without a basis in biological reality, is tolerable for neural networks where each neuron is 
known to have a huge number of connections with others. In contrast, in immune networks, where interactions 



among lymphocytes are much more specific, the underlying topology must be carefully modelled and is 
expected to play a crucial operational role. From a theoretical physics perspective, a network of interacting 
B- and T-cells resembles a bipartite spin glass. It was recently shown that such bipartite spin glasses exhibit 
retrieval features which are deeply related to their structures [l5j[TBJ, and this can be summarized as follows: 

• There exists a structural equivalence between Hopfield neural networks and bipartite spin glasses. In 
particular, the two systems share the same partition function, and hence the same thermodynamics 

[IIlIIH]. 

• One can either dilute directly a Hopfield network, or its underlying bipartite spin glass. The former 
does not affect pattern retrieval qualitatively fi9 \ 120) \2T\ \22 \ [23l [24 ] , whereas the latter causes a switch 
from serial to parallel processing [15l[l^ (i.e. to simultaneous pattern recall). 

• Simultaneous pattern recall is essential in the context of immunology, since it implies the ability to 
respond to multiple antigens simultaneously. The analysis of such systems requires a combination of 
techniques from statistical mechanics and graph theory. 

The last point is the focus of the present paper, which is organized as follows. In Section 2 we describe a 
minimal biological scenario for the immune system, based on the analogy with neural networks. We define 
our model and its scaling regimes, and prepare the stage for calculations. Section 3 gives a comprehensive 
analysis of the topological properties of the network in the extremely diluted regime, which is the scaling 
regime assumed throughout our paper. Section 4 is dedicated to the statistical mechanical analysis of the 
system in the medium load regime, focusing on simultaneous pattern recall of the network. Section 5 deals 
with the high load regime. Here the network is found to behave as a spin glass, suggesting that a higher 
degree of dilution should be implemented - in remarkable agreement with immunological findings [25] 126) - 
and this will be the focus of future research. The final section gives a summary of our main conclusions. 

2. Statistical mechanical modelling of the adaptive immune system 
2.1. The underlying biology 

All mammals have an innate (broad range) immunity, managed by macrophages, neutrophils, etc., and an 
adaptive immune response. The latter is highly specific for particular targets, handled by lymphocytes, and 
the focus of this paper. To be concise, the following introduction to the adaptive immune system has already 
been filtered by a theoretical physics perspective, and immunological observables are expressed in'physical' 
language. We refer to the excellent books [25] l26] for comprehensive reviews of the immune system, and to 
a selection of papers [H |3[ 01 [151 IISl [2Zl for explanations of the link between 'physical' models and biological 
reality. Our prime interest is in B-cells and in T-cells; in particular, among T-cells, in the subgroups of 
so-called 'helpers' and 'suppressors'. B-cells produce antibodies and present them on their external surface 
in such a way that they are able to recognize and bind pathogenic peptides. All B-cells that produce the 
same antibody belong to the same clone, and the ensemble of all the different clones forms the immune 
repertoire. This repertoire is of size (9(10^ — 10^) clones in humans. The size of a clone, i.e. the number 
of identical B-cells, may vary strongly. A clone at rest may contain some O{10^ — 10^) cells, but when it 
undergoes clonal expansion its size may increase by several orders of magnitude, to up to 0(10^ — 10^). 
Beyond this size the state of the immune system would be pathological, and is referred to as lymphocytosis. 

When an antigen enters the body, several antibodies (i.e. several B-cells belonging to different clones) 
may be able to bind to it, making it chemically inert and biologically inoffensive. In this case, conditional 
on authorization by T- helpers (mediated via cytokines), the binding clones undergo clonal expansion. This 
means that their cells start duplicating, and releasing high quantities of soluble antibodies to inhibit the 
enemy. After the antigen has been deleted, B-cells are instructed by T-suppressors, again via cytokines, to 
stop producing antibodies and undergo apoptosis. In this way the clones reduce their sizes, and order is 
restored. Thus, two signals are required for B-cells to start clonal expansion: the first signal is binding to 
antigen, the second is a 'consensus' signal, in the form of an eliciting cytokine secreted by T-helpers. This 
latter mechanism prevents abnormal reactions, such as autoimmune manifestation^ 

:j: Through a phenomenon cahed 'cross-hnking', a B-ceh can also have the abihty to bind a self-peptide, and may accidentally 
start duplication and antibody release, which is a dangerous unwanted outcome. 
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Figure 1. Left: schematic representation of the bipartite spin-glass which models the interaction between 
B- and T-cells through cytokines. The latter are drawn as colored links, with red representing stimulatory 
cytokines (positive couplings) and black representing inhibiting ones (negative couplings). Note that the 
network is diluted. Right: the equivalent associative multitasking network consisting of T-cells only, obtained 
by integrating out the B-cells. This network is also diluted, with links given by the Hebbian prescription. 

T- helpers and T-suppressors are lymphocytes that work 'behind the scenes', regulating the immune 
response by coordinating the work of effector branches, which in this paper are the B-cells. To accomplish 
this, they are able to secrete both stimulatory and suppressive chemical signals, the cytokines ^28^ 29j. If 
within a given (small) time interval a B-clone recognizes an antigen and detects an eliciting cytokine from a 
T-cell, it will become activated and start duplicating and secreting antibodies. This scenario is the so-called 
'two-signal model' [30l [HIl 1321 133] . Conversely, when the antigen is absent and/or the cytokine signalling is 
suppressive, the B-cells tuned to this antigen start the apoptosis programme, and their immuno-surveillance 
is turned down to a rest state. For simplicity, we will from now on with the term 'helper' indicate any helper 
or suppressor T-cell. The focus of this study is to understand, from a statistical mechanics perspective, the 
ability of helpers and suppressors to coordinate and manage simultaneously a huge ensemble of B-clones 
(possibly all). 

2.2. A minimal model 

We consider an immune repertoire of Nb different clones, labelled by /i G {1, A/'^}. The size of clone 
/i is h^. In the absence of interactions with helpers, we take the clone sizes to be Gaussian distributed; 
without loss of generality we may take the mean to be zero and unit width, so ~ A/'(0, 1). A value ^ 
now implies that clone /i has expanded (relative to the typical clonal size), while 6^ <C implies inhibition. 
The Gaussian clone size distribution is supported both by experiments and by theoretical arguments [4). 
Similarly, we imagine having Nt helper clones, labelled by i G {1, A/'t}- The state of helper clone i is 
denoted by <j^. For simplicity, helpers are assumed to be in only two possible states: secreting cytokines 
{ci = +1) or quiescent (a^ = —1). Clone sizes and the helper states are dynamical variables. We will 
abbreviate a = (<Ji, . . . , ctat^) G { — 1, 1}^^, and b = (6i, . . . , bNs) ^ • 

The interaction between the helpers and B-clones is implemented by cytokines. These are taken to be 
frozen (quenched) discrete variables. The effect of a cytokine secreted by helper i and detected by clone 
/i can be nonexistent (^f = 0), excitatory (^^ = 1), or inhibitory (^^ = — 1). To achieve a Hamiltonian 
formulation of the system, and thereby enable equilibrium analysis, we have to impose symmetry of the 
cytokine interactions. So, in addition to the B-clones being influenced by cytokine signals from helpers, the 
helpers will similarly feel a signal from the B-clones. This symmetry assumption can be viewed as a necessary 
first step, to be relaxed in future investigations, similar in spirit to the early formulation of symmetric spin- 
glass models for neural networks |34l [35j. We are then led to a Hamiltonian H(6, t7|^) for the combined 
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system of the following form (modulo trivial multiplicative factors): 

Nt Nb Nb 



^ ^ 2=1 n—l ^ 11=1 



In the language of disordered systems, this is a bipartite spin-glass. We can integrate out the variables 6^, 
and map our system to a model with helper-helper interactions only. The partition function at 
inverse clone size noise level \/P (which is the level consistent with our assumption ~ A/'(0, 1)) follows 
straightforwardly, and reveals the mathematical equivalence with an associative attractor network: 

ZnAP,0=Y. [dbi...dbN^exp[-./^H{b,CT\0] 
a •' 

= ^exp[-/3H(<T|e)], (2) 
a 

in which, modulo an irrelevant additive constant, 

. Nt Nb 

ncr\i) = - 2 E ^^J^^^^^ = AT Ec"e; (3) 

ij=i ^ fi=i 

Thus, the system with Hamiltonian H(6, cr|^), where helpers and B-clones interact through cytokines, is 
thermodynamically equivalent to a Hopfield-type associative network represented by H(cr|^), in which helpers 
mutually interact through an effective Hebbian coupling. See Figure [T] Learning a pattern in this model 
then means adding a new B-clone with an associated string of new cytokine variables. 

If there are no zero values for the {^f }, the system characterized by (l3| is well known in artificial 
intelligence research. It is able to retrieve each of the Nb 'patterns' (^f , . . . , ^^), provided these patterns are 
sufficiently uncorrelated, and both the ratio a = Nb/Nt and the noise level are sufficiently small 
|4i [201 133 mi- Retrieval quality can be quantified by introducing Nb suitable order parameters, viz. 
^/i(cr) = A^^"^ ii^i^ terms of which the new Hamiltonian ^ can be written as 

N 

If a is sufficiently small, the minimum energy configurations of the system are those where m^{a) = 1 for 
some /i ('pure states'), which implies that a = (^f , . . . ,^^) and pattern fi is said to be retrieved perfectly. 
But what does retrieval mean in our immunological context? If m^(cr) = 1, all the helpers are 'aligned' with 
their coupled cytokines: those i that inhibit clone /i (i.e. secrete = —1) will be quiescent (<j^ = —1), and 
those i that excite clone /i (i.e. secrete = 1) will be active (a^ = 1) and release the eliciting cytokine. As 
a result the B-clone fi receives the strongest possible positive signal (i.e. the random environment becomes 
a 'staggered magnetic field'), hence it is forced to expand. Thus the arrangement of helper cells leading to 
the retrieval of pattern /i corresponds to clone- specific excitatory signalling upon the B-clone fi. 

However, if all G { — 1, 1} so the bipartite network is fully connected, it can expand only one B-clone 
at a time. This would be a disaster for the immune system. We need the dilution in the bipartite B-H 
network that is caused by having also = (i.e. no signalling between helper i and clone /i), to enable 
multiple clonal expansions. The associative network (|3| now involves patterns with blank entries, and 'pure 
states' no longer work as low energy configurations. Retrieving a pattern no longer employs all spins cr^, and 
those corresponding to null entries can be used to recall other patterns. This is energetically favorable since 
the energy is quadratic in the magnetizations m^(cr). Conceptually, this is only a reshaping of the network's 
recall tasks: no theoretical bound for information content is violated, and global retrieval is still performed 
through Nb bits. However, the perspective is shifted: the system no longer requires a sharp resolution in 
information exchange between a helper clone and a B-clon^ It suffices that a B-clone receives an attack 
signal, which could be encoded even by a single bit. In a diluted bipartite B-H system the associative 

§ In fact, the high-resolution analysis is performed in the antigenic recognition on the B-cell surface, which is based on a sharp 
key-and-lock mechanism [2 . 
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capabilities of the helper network are distributed, in order to simultaneously manage the whole ensemble of 
B-cells. The analysis of these immunologically most relevant pattern-diluted versions of associative networks 
is still at an early stage. So far only the low storage case Nb ^ log Nt has been solved [15l [16]. In this 
paper we analyse the extreme dilution regime for the B-H system, i.e. Nb ^ N^ with < S < 1. 



3. Topological properties of the emergent networks 

3.1. Definitions and simple characteristics 

We start with the definition of the bi-partite graph, which contains two sets of nodes (or vertices): the set 
Vb representing B-cells (labelled by /i) and the set Vt representing T-cells (labelled by i), of cardinality 
Nb and Nt, respectively. Nodes belonging to different sets can be pairwise connected via links, which are 
identically and independently drawn with probability p, in such a way that a random bipartite network B is 
built. We associate with each link a weight, which can be either +1 or —1; these weights are quenched and 
drawn randomly from a uniform distribution. As a result, the state of each link connecting the /i-th B-clone 
and the i-th T-clone can be denoted by a random variable , distributed independently according to 

PiO = f + + (1-P)%,0 (5) 

We choose p = c/N^, with 7 G [0, 00) subject to p < 1, and c = 0{N^). Upon tuning 7, B displays 
different topologies, ranging from fully connected (all Nt x Nb possible links are present, for 7 ^ 0) to fully 
disconnected (for 7 00). We have shown in the previous section how a process on this bipartite graph can 
be mapped to a thermodynamically equivalent process on a new graph, built only of the Nt nodes in Vr, 
occupied by spins that interact pairwise through a coupling matrix with (correlated) entries 

Nb 

j^j = Y.^e;- (6) 

The structure of the marginalized system is represented by a weighted mono-partite graph with weights (|6|, 
whose topology is controlled by 7. To illustrate this, let us consider the weight distribution P{J\Nb, Nt, 7, c), 
which can be interpreted as the probability distribution for the end-to-end distance of a one-dimensional 
random walk. This walk has a waiting probability = 1 — p, and probabilities of moving left {pi) or right 
(Pr) equal to Pi = Pr = p/2, i.e. 

p^ = l-{c/N^)\ pi=p,= ^-{c/N^)\ (7) 



Therefore, we can write 



L-J 



P{J\Nn Nt 1 C)= V' ^ ^{Nb-S+J)/2 {Nb-S-J)I2 . . 

r-\^j|iVB,iVT, 7,c; ^ g^i ^ Ns-S-J y (^ NB-S+J y Pr Pi ^ 

where the prime indicates that the sum runs only over values of S with the same parity as Nb ± J- The 
result (|8| can easily be generalized to the case of biased weight distributions [38j , which would correspond 
to non-isotropic random walks. The first two moments of (|8| are, as confirmed numerically in Figure [2] 

(J) = 0, (J^) = (c/N^f Nb, (9) 

We now fix a scaling law for Nb, namely Nb — aN^, with a > 0. This includes the high- load regime for 
6 = 1, as well as the medium- load regime for 5 G (0, 1). The low-storage regime ^ = has already been 
treated elsewhere [151 Ull • We then find 

{J^)=acyNp-^, (10) 

The probability of having J ^ scales like N^~'^^ for 27 > (5, while for 27 < (5 it approaches 1 in the limit 
Nt 00. One can recover the same result via a simple approximation, which is valid in the case <C 1: 



P{J = 0\Nb,Nt,^,c) ^p^-= l-{c/N^y , (11) 
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Figure 2. Statistical properties of individual links in randomly generated instances of the graph Q at 
different sizes Nt, with A^^ = aN^. We measured the mean coupling (J) (left), the mean squared coupling 
(J^) (middle) and the probability P(J = 0) of a zero link (right), for different values of 7. The parameters 
6 = 1 and a = 0.5 are kept fixed. Solid lines: predictions given in ([9} and ( |12| ). Markers: simulation data. 



see also Appendix A.l| fQr a more rigorous derivation of P(J = 0|A/'b, A/'t, 7, c). Given the assumed scaling 
of Nb, we get P{J ^ 0\NB{a,S, Nt), Nt.j.c) ^ 1 - e-"^'^^'"'^, which translates into 

r ac^Nr^-^^ if 2j > 5 

P{J^O\NB{a,S,NT),NTn,c)^ I I - e'^'" if 27 = J , (12) 

[ 1 if 27 < (5 

This quantity can be interpreted as the average link probability in Q. The average degree z over the whole 
set of node^can then be written as 

z = NtP{J ^^). (13) 
Thus, if we adopt a mean-field approach based only on the estimates ( 12|13 ), we find that Q can display the 



following topologies, expressed in terms of the average degree z oi Q (the average number of links per node): 





< 7 < 1 




7=1 




(5 < 27 - 1 


fully disconnected. 


z^O 


fully disconnected. 


z^O 


(5 = 27 - 1 


finitely connected. 


z = 0{l) 


finitely connected. 


z = 0{l) 


27 - 1 < (5 < 27 


extremely diluted. 


2—7-00 but z/Nt - 


-> 




5 = 27 


finitely diluted, z - 


= 0{Nt) 






5 > 27 


fully connected, z 


= Nt 







The missing entries in the table correspond to forbidden values S ^ (0,1]. The various cases are also 
summarized in the left panel of Fig. [3j This picture is confirmed by numerical simulations. The right panels 
of Fig. [3] give a finite size scaling analysis for the average degree z and its fiuctuations — measured 
in realizations of Q for several choices of parameters. We also show corresponding data for Erdos-Renyi 

II The degree or coordination number Zi of node i is the number of its nearest-neighbors, i.e. the number of links stemming 
from the node itself. Thus, the average degree z = J^ieVr ^^/-^T measures the density of links present in the graph. 
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Figure 3. Left: qualitative phase diagram describing the different topological regimes of according 
to the mean-field analysis in Sec. [s] Right panels: finite-size scaling for the average degree z (upper) and 
fiuctuations z'^ — z? (lower), measured on realizations of Q for different choices of parameters 7 and 5 (see 
legend). The parameters c = 1 and a. = 0.1 are kept fixed,. The markers correspond to numerical data, and 
the lines connecting the markers are guides to the eye. The cases being compared include fully disconnected 
(FD), extremely diluted (ED), finitely diluted (FD) and fully connected (FC) regimes, and they are shown 
together with data on Erdos-Renyi graphs with link probability q = 1 — e^^^ (dashed lines, see Eq.|l2|. 

For ER graphs one expects z/Nt to be constant, and equal to g, while the normalized fiuctuations are 
expected to be (1 — z/Nt)z/Nt- Different points pertain to different graph sizes, but with the same link 
probability g; they are found to overlap regardless of their size (•). For Qj the behaviour of z is consistent 
with mean-field expectations, while connectivity fiuctuations are underestimated by the mean-field approach. 



graphs, where ah hnks are independently drawn with probabihty for comparison (here z = qNt and 
— = NTq{l — q)). We find that (i) in the fully disconnected regime of the phase diagram, z decays 
to zero exponentially as a function of TVt, (ii) in the extremely diluted regime, z scales with Nt according 
to a power law, (iii) in the finitely diluted regime z is proportional to Nt, and (iv) in the fully connected 
regime z saturates to Nt. There is thus full agreement with the predictions of the mean- field approach. 
The fluctuations are slightly larger that those of a purely randomly drawn network, which suggests that Q 
exhibits a certain degree of inhomogeneity. This will be investigated next. 

It is important to stress that, as the system parameters 7 and 6 are tuned, the connectivity of the 
resulting network Q can vary extensively and therefore, in order for the Hamiltonian (|4| to scale linearly 
with the system size Nt, the prefactor 1/Nt of the Hopfield model embedded in complete graphs, is not 
generally appropriate. One should normalize H(cr|^) according to the expected connectivity of the graph. 



3.2. Component size distribution 

As 7 is increased, both B and Q become more and more diluted, eventually under-percolating. The topology 
analysis can be carried out more rigorously for the (bi-partite) graph S, since its link probability p = c/Nj. 
is constant and identical for all We can apply the generating function formalism developed in pQ) |4Q] 

to show that the size of the giant component (C Vt,Vb) diverges when 

p^ = 1/NbNt. (14) 
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Figure 4. Left: distribution of component sizes of the graph Q, for S = c = 1, a = 0.1 and different 
values of 7. Symbols represent simulation data; the solid line represents the analytical estimate for the 
underpercolated case (□). Right: degree distributions obtained for different values of 6 and 7, with a = 0.1 
and c = 1 kept fixed. The semi-logarithmic scale highlights the exponential decay at large values of z. 



Hence, upon setting Nb = aN^, the percolation threshold for the bipartite graph B is defined by 

Np-^-^ = c'a = 0{N^) 7 = ((5 + l)/2, (15) 

which is consistent with the results of Section [3j we refer to [Appendix A. 2 for full details. Below 



the percolation threshold the generating function formalism also allows us to get the distribution 
Pb{s\Nt^ Nb^c^j) for the size s of the small components occurring in B. A (connected) component of 
an undirected graph is an isolated subgraph in which any two vertices are connected to each other by paths; 
the size of the component is simply the number of nodes belonging to the component itself. We prove in 



Appendix A. 2 that just below the percolation threshold, P]s{s\Nb^ Nt^ c, 7) scales exponentially with s. One 
finds that this is true also for the distribution Pg{s\NBT Nt^ c^j) of graph Q (see Figure |4j left panel). 

Interestingly, the small components of Q that emerge around and below the percolation threshold play 
a central role in the network's retrieval performance. To see this, one may consider the extreme case where 
the bi-partite graph B consists of trimers only. Here each node /i G is connected to two nodes ^1,^2 ^ Vr, 
that is l^fj = l^^l = 1 and = — O^Vi^ 7^ /i. The associated graph Q is then made up of dimers 
(21,^2) only, and Jij G { — 1,0,1} for all (i,^). The energetically favorable helper cell configuration a is 
now the one where ^^(.iCFi = ±2, for any /i. This implies that retrieval of all patterns is accomplished 
(under proper normalization). In the opposite extreme case, B is fully connected, and the helper cell system 
becomes a Hopfield network where parallel retrieval is not realized. In general, around and below the 
percolation threshold, the matrix ^ turns out to be partitioned, which implies that also the coupling matrix 
J is partitioned, and each block of J corresponds to a separate component of the overall graph Q. For 
instance, looking at the bipartite graph S, a star-like module with node /i G Vb at its center and the nodes 
^1,^2, -,^71 ^ Vt as leave^can occur when the leaves share a unique non-null fi-th entry in their patterns, 
that is = l^^l = ... = l^^l = 1. For the graph Q this module corresponds to a complete sub-graph Kn 
of n < Nt nodes. In this case the retrieval of pattern /i is trivially achieved. In fact, a complete sub-graph 
Kn in Q can originate from more general arrangements in B: each leaf i can display several other null entries 
beyond /i, but these are not shared, that is ^J'^J = V j G Vr, ^ 7^ M ^ ^b- For instance, all stars with 
centers belonging to Vb and with leaves of length 1 or 2 fall into this extended class. Again, the retrieval of 
pattern fi and possibly of further patterns u is achieved. However, the mutual signs of magnetizations are 
no longer arbitrary, as the terms m^(^^ = rriy^^^ are subject to constraints. 

We can further generalize the topology of components in Q compatible with parallel retrieval, by 
considering cliques (i.e. subsets of nodes such that every two nodes in the subset are connected by an 

^ The opposite case of a star-like module with the center belonging to Vt is unlikely, given that Njp > Nb ■ 
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Figure 5. Plots of numerically generated graphs Q for A^^ = 10^ and a = 0.1, and different combinations 
of All isolated nodes have been omitted from the plots. The chosen parameter combinations all give 

graphs that are just below the percolation threshold 6 = 2j — 1. However, the two graphs at the top satisfy 
the further condition 5 < 7 that marks the ability of simultaneous pattern retrieval (via weakly connected 
small cliques). In the bottom graphs this latter condition is violated, so these would behave more like 
conventional Hopfield networks (here simultaneous retrieval of multiple patterns is not possible). 

edge), which are joined together by one hnk: each chque consists of nodes G Vr that share the same non-nuh 
entry, so that the unique hnk between two chques is due to a node displaying at least two non-null entries. 
This kind of structure exhibits a high degree of modularity; each clique is a module and corresponds to a 
different pattern. As for retrieval, this arrangement works fine as there is no interference between the signal 
on each node in Vt- For this arrangement to occur a sufficient condition is that B is devoid of squares, so 
that two nodes G Vt do not share more than one neighbor. This implies that, among the n nodes connected 
to j ^ Vb^ the probability that any number k > 2 of these display another common neighbor is vanishing: 

lim Ef^V(l-P)"-'= |l-fl-^)"-^^fl-l^) l=0- (16) 

AT^^oo^V/cr ^ TVt^oo I \ N^J N^J J ^ ^ 

k>2 

Since n ~ N^~^ , we obtain the condition 7 > (5. Examples of numerically generated graphs for different 
choices of parameters, are shown in Fig. [5] 
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Figure 6. Histograms for P{C\Nb-, Nt, 7, c), normalized such that P(C\Nb, Nt,J, c) = 1 for those values 
of C that have the lowest occurrence frequency. This normalization allows to highlight the intermediate 
region, where the frequencies are much lower than those pertaining to C = and C = 1. The results shown 
here refer to graphs with Nj- = 10^ nodes, a = 0.1 and c = 1, while 6 and 7 are varied. 



3.3. Clustering properties 

We saw that in the operationahy most important parameter regime the graph Q is built of smah chques 
which are poorly (if at all) connected to each other. This means that non-isolated nodes are highly clustered 
and the graph will have a high degree of modularity, i.e. dense connections between nodes within the same 
'module', but sparse connections between nodes in different 'modules'. The clustering coefficient Ci of a 
node i measures how close its Zi neighbors are to being a clique. It is defined as 

Ci = 2E,/zi{z,-l), (17) 

where Ei is the number of links directly connecting nodes pairs in Vi, while \zi{zi — 1) is the total number 
of non-ordered node pairs in Vi. Hence Ci G [0, 1]. The average clustering coefficient C = N^^j. Xl^^y^ ^ Ci 
measures the extent to which nodes in a graph tend to cluster together. It is easy to see that for a bipartite 
graph, by construction, = for any node, while for homogeneous graphs the local clustering coefficients are 
narrowly distributed around C. For instance, for the Erdos-Reny random graph, where links are identically 
and independently drawn with probability the local coefficients are peaked around q. As for our graphs ^, 
due to their intrinsic inhomogeneity, the global measure C would give only limited information. In contrast, 
the distribution P{C\Nb-,Nt-,^-,c) of local clustering coefficients informs us about the existence and extent 
of cliques or 'bulk' nodes, which would be markers of low and high recall interference, respectively. Indeed, 
as shown in Figure [6j in the highly-diluted regime most of the nodes in Q are either highly clustered, i.e. 
exhibiting Q = 1, or isolated, with Ci = 0, whereas the coefficients of the remaining nodes are distributed 
around intermediate values with average decreasing with 7, as expected. In particular, when both 6 and 7 
are relatively large, P{C\Nb-,Nt-,^-,c) approaches a bimodal distribution with peaks at C = and C = 1, 
whereas when 5 is sufficiently larger than 7, there exists a fraction of nodes with intermediate clustering 
which make up a bulk. Therefore, although the density of links is rather small, the average clustering 
coefficient is very high, and this is due to the fragmentation of the graph into many small cliques. 

To measure the extent of modular structures we constructed the topological overlap matrix T, whose 
entry Tij = ^j^^i j CikCjk/ Zi G [0,1] returns the normalized number of neighbors that i and j share. The 
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Figure 7. Overlap matrix T with entries Tij = XlfcT^i j ^ikCjk/zi- The nodes are ordered such that nodes 
with large overlaps are adjacent, and the most significant part of T is around the diagonal. Note: Tu = 1 for 
all i, by construction. Darker colours correspond to larger entries, and any extended coloured zone denotes 
a module, i.e. a set of nodes that are highly clustered and possibly not connected with the remaining nodes. 
These plots refer to numerically generated graphs with A^^ = 10^ nodes, a = 0.1 and c = 1, while 6 and 7 
are varied. To avoid cluttering of the figures, only a fraction 200 x 200 of each pattern is shown. A similar 
fraction of the overlap matrix for an extremely diluted Erdos-Renyi graph made of 10^ nodes is also depicted 
for comparison (inset). In this case there is no evidence of modularity; T displays a homogeneous pattern. 



related patterns for several choices of parameters are shown in the plots of Fig. [7| and compared to those of 
Erdos-Renyi graphs Q^^l- Erdos-Renyi graphs T displays a homogeneous pattern, that is very different 
from the highly clustered cases emerging from Q. In particular, for the highly diluted cases considered here, 
we find that smaller values of 7 induce a smaller number of modules, that are individually increasing in size. 

4. Medium storage regime in extremely diluted connectivity: retrieval region 

We now turn to the statistical mechanics analysis, and consider the immune network model composed of Nt 
T-clones (cr^, i = 1, . . . , Nt) and Nb B-clones (6^, /i = 1, . . . , Nb), such that the number ratio scales as 

lim Nb/N^ = a, S e (0, 1), a > (18) 

Nt ^00 

The effective interactions in the reduced network with helper cells only are described by the Hamiltonian 

Nt Nb 

^He) = -^EE^r^^^'^i' (19) 
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where the cytokine components G {0,=bl} are quenched random variables, independently and identically 
distributed according to 

p(er = 1) = p(e = -1) = c/2N^T^ p(er = o) = i - c/tv? (20) 

with 7 G [0, 1). The parameter r must be chosen such that T-L{cr\£) scales linearly with Nt^ and must therefore 
depend on 7 and 5. Heuristically, since the number of non-zero entries Mm in ^ generic pattern (^f , . . . , ^%^) 
is 0(A/'^~^), we expect that the network can retrieve a number of patterns of order 0{Nt /Mnz) = 0{N^). 
We therefore expect to see changes in r only when crossing the region in the (7, plane where pattern 
sparseness prevails over storage load (i.e. < 7, where the system can recall all patterns), to the opposite 
situation, where the load is too high and frustration by multiple inputs on the same entry drives the network 
to saturation (i.e. ^ > 7). To validate this scenario, which is consistent with our previous topological 
investigation, we carry out a statistical mechanical analysis, based on computing the free energy 

/(^) = - lim -^{logZNAP,0)i- (21) 
4.1. Free energy computation and physical meaning of the parameters 

If the number of patterns is sufficiently small compared to TV^, i.e. (5 < 1, we do not need the replica method; 
we can simply apply the steepest descent technique using the Nb <C Nt / log Nt Mattis magnetizations as 
order parameters: 

m = -1 log2 - hm log / dm ,-|^^+^-Kcosh(y^em))^_ 

•^^^^ /3 ^ Nt ^oc /3Nt J 

with m = (mi, . . . , tunb)^ C = (^^5 • • • 5 C^^) ^ • = ^^^/u- Rescaling of the order parameters via 
m^^cN^^'^^^ then gives 

/(^)=-4log2- lim 1 log/dme^-(-'^^?"""'^^+^'°«^°^'^(^'=^^^-"^))J, (23) 

p Nt ^oo (jNt J 

Hence, provided the limit exists, we may write via steepest descent integration: 



f{f3) = - ^ log 2 - ^ lim extrm 

p p Nt^oo 



(log cosh {(3cN^^ -m))^- ^AT^+^^-im^ 



(24) 



Differentiation with respect to the gives the self consistent equations for the extremum: 

ATi-T-e 

= (^^tanh(^c7V^^ • m))^. (25) 

With the additional new parameter ^, we now have two parameters with which to control separately two 
types of normalization: the normalization of the Hamiltonian, via r, and the normalization of the order 
parameters, controlled by ^. To carry out this task properly, we need to understand the physical meaning of 
the order parameters. This is done in the usual way, by adding suitable external fields to the Hamiltonian: 

Nb Nt 

n^n-Y^X^Y^C^ai (26) 

/X=l 2=1 

Now, with {g{a))a- = Z^^(/3,^) X]<t ^~^^^^'^^^(^) corresponding new free energy f{/3,X), 



with the short-hand A = (Ai . . . , Aat^). The new free energy is then found to be 

~2 



/(^, A) = - ^ log 2- \ lim extrm 



(log cosh {Pi . [c7V^m+A])> - ^TV^+'^-^m^ 



(28) 
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Upon differentiation with respect to we find (27) taking the form 

l^(EC''^^)- = /^ (enanh(/3c7V^C-m))^. (29) 

i—1 

We can then use expression ( [25| ) for to obtain the physcal meaning of our order parameters: 
m^= hm (^^tanh(/3c7V^|-m)). 

Nt^oo C 

Nt 

= hm {^^y^^a,),. (30) 

1 z=l 

Let us summarize the status of the various remaining control parameters in the theory, in the interest of 
transparency. Our model has three given external parameters: 



• 7 G [0, 1): this quantifies the dilution of stored patterns, via P(^f 7^ 0) = cA^^ ^ 



5 G (0, 1) and a > 0: these determine the number of stored patterns, via limAr^^oo Nb/N^ 



T 
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It also has two 'internal' parameters, which must be set in such a way for the statistical mechanical calculation 
to be self-consistent, i.e. such that various quantities scale in the physically correct way for Nt 00: 

• r > 0: this must ensure that the energy H = —{^N^^ ^^=iC^f=i ii^iY)cr scales as 0{Nt)-, 

• > 0: this must ensure that the order parameter = ((1/cA^^^^) ^iCTi)^ are of order 0(1). 

4-2. Setting of internal scaling parameters 

To find the appropriate values for the internal scaling parameters and r we return to the order parameter 



equation (25) and carry out the average over This gives 

= ^^(tanh (/3c7V:f ((e^)'m^ + ^ rm.)))^. (31) 

Nb 

= Ar^"^"^"^(tanh {pcN^{m^ + (32) 

Having non- vanishing in the limit Nt 00 clearly demands 6> + r<l — 7. If^>0 the will become 
independent of /3, which means that any phase transitions occur ar zero or infinite noise levels, i.e. we would 
not have defined the scaling of our Hamiltonian correctly. Similarly, if -\-r < 1 — 7 the effective local fields 
acting upon the (viz. the arguments of the hyperbolic tangent) and therefore also the expectation values 
(cr^)cr, would be vanishingly weak. We therefore conclude that a natural ansatz for the free exponents is: 

(t,^) = (1-7,0) (33) 

This simplifies the order parameter equation to 

Nb 

= (tanh {(3c{m^ + Xl^''^^)))^ (^^) 
Let us analyze this equation further. Since P(^f 7^ 0) ~ N^^ with 7 > 0, we can for Nt 00 replace in 



(31) the sum over i' ^ (Jl with the sum over all /i; the difference is negligible in the thermodynamic limit. In 



this way it becomes clear that for each solution of (31) we have G {— m,0,m}. Using the invariance of 



the free energy under — ^/i, we can from now on focus on solutions with non-negative magnetizations. 

If we denote with K < Nb the number of fi with ^ 0, then the value of m > is to be solved from 

K 

m= (tanh(/3cm(l + ^r)))^ (35) 
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It is not a priori obvious how the number K of nonzero magnetizations (i.e. the number of simultaneously 
triggered clones) can or will scale with Nt- We therefore set K = (j)N^ , in which the condition K < Nb 
then places the following conditions on (j) and 5': 5' G [0, (5], and ^ G [0, oo) if < (5 or ^ G [0, a] \i 5' = 5. 
We expect that if K is too large, equation (35) will only have the trivial solution for Nt oo, so there 
will be further conditions on and S' for the system to operate properly. If 5' > 7, the noise due to other 
condensed patterns (i.e. the sum over u) becomes too high, and m can only be zero: 



E 



L /i=i 



(36) 



On the other hand, if 5' < 7 this noise becomes negligible, and (35) reduces to the Curie- Weiss equation, 
whose solution is just the Mattis magnetization [5UJ [36l [H] . It follows that the critical case is the one where 
when S' = J. Here we have for Nt 00 the following equation for m: 



m 



= ^7r(A:|^)tanh(/3cm(l + A:)) 



with the following discrete noise distribution, which obeys 7r{—k\(j)) = 7T{k\(j)): 

7T{k\(l)) = <4,E^.i^^)^ 



(37) 



(38) 



4.3. Computation of the noise distribution 7r(/c) 

Given its symmetry, we only need to calculate 7r{k\(j)) for k > 0: 



lim 



dip 



K 

K^oo 27r ^ 



-iipk / Jipi 



lim 

K^oo 



I 

J —77 



-iipk 



27r 



(1 



-^(cosV^- 1)) 



■f 



-iTpk+(pc{cos ^ — l) 



= e 



n>0 l<n 



-iipk 



E- 

n>0 



tip 



-iip\n 



-iipk 



((/)c)" ^ n! 

n>0 Kn ^ ^ 



-iip{k-n+2l) 



1 



Z>0 



2 

k+2l 



l\{n-l)\ 
1 



e-^^Xfc((/)c) 



(39) 



l\{k^l)\ 

where Xj^{x) is the k-th modified Bessel function of the first kind [53]. These modified Bessel functions obey 

■■Xk-i{x) -Xk+i{x), 

(40) 



2-Xk{x) 

X 



2-^Xk{x) =Xk-i{x) +Xfe+i(x). 

The first identity leads to a useful recursive equation for 7r(A:|0), and the second identity simplifies our 
calculation of derivatives of 7T{k\(j)) with respect to ^, respectively: 

7r(A:-l|^) - 7r(A: + l|^) - 27r{k\^)^ = 0, (41) 



-7r{k\^) = c(i^(fc-l|(/)) + ^^(fe + l|(/>) - 7r{k\^)) 



(42) 
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4-4- Retrieval in the zero noise limit 

To emphasize the dependence of the recall overlap on 0, viz. the relative storage load, we will from now 
on write m m(j). With the abbreviation {g{k))k = ^k^i^\^)9i^)^ using (41) and the symmetry of 
7r(A:|^), we can transfer our equation (jSTl into a more convenient form: 



7710 = -( |'tanh(/3cm0(l + /c)) + tanh(/3cm0(l — /c)) )k 



2H 
1 

2 



\ ^ ^7r(fc-l|(/)) - 7r(fc + l|^) tanh(/3cm0fc) = ^{ktdinh{(3cm^k))k (43) 

In the zero noise limit /3 ^ oo, where tanh(/3?/) sgn(?/), this reduces to = ^(|^|)/c, or, equivalently, 
m0 = lim (tanh(/3cm(l + A:)))/c = (sign(l + A:))/e 

= = 40|'^) + 7r(l|(/.), (44) 

k>-l k<-l 

Hence we always have a nonzero rescaled magnetization, for any relative storage load (/). To determine for 
which value of (j) this state is most stable, we have to insert this solution into the zero temperature formula 
for the free energy and find the minimum with respect to (j). Here, with = rricj) for dl\ ji < K = (pN^ and 
= for /i > i^, the free energy (24) takes asymptotically the form 

/(/?) = ^c^Ml - ^ (log cosh {pcm^k))k - ^ log 2 (45) 

So for /3 — > 00, and using our above identity {\k\)k = (f>cm, we find that the energy density is 

u{<i>) = hm fip) = - cm{\k\)k = -\cHml (46) 

= _ic20(7r(O|0)+7r(l|</.))' (47) 

To see how this depends on ^ we may use ( [42| , and find 

Id,,, 1 o , d 

2 



\jnl - (l)cm^{^ - ^7r(O|0) + ^7r(2|0)^ = + ^07r(l|^) 

= -]^m^(m^-2n(l\^)) =-im0(^(O|(/))-^(l|(/))) <0 (48) 

The energy density u{(j)) is apparently a decreasing function of (/), which reaches its minimum when the 
number of condensed patterns is maximal, at ^ = a. However, the amplitude of each recalled pattern will 
also decrease for larger values of 

= ^Am) + ^^(ll*^) = < (49) 

Hence starts at mo = 1, due to 7r(/c|0) = and then decays monotonically with (j). Moreover, it follows 
from (1^1)^ < (P), = (E^<K(e^)')^ = that 

= {\k\)k/(l)c < 1/V^, ii(0) = -^c^^m^ > -^c (50) 

If we increase the number of condensed patterns, the corresponding magnetizations decrease in such a way 
that the energy density remains finite. 
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Figure 8. Left: energy density u versus the relative fraction of retrieved patterns, in terms oi cf) = ccf) 
and T = T/c = l//3c. The minimum energy density is reached when cf) is maximal, i.e. when all stored 
patterns are simultaneously retrieved, but with decreasing amplitude for each. Right: critical noise levels 
for different values of confirming that T^'^ = /3c = 1, independently of In both the panels, solid lines 
represent our theoretical predictions, while symbols represent data from numerical simulations on systems 
with Nt = 5 X 10^, J = 6 = 0.45, c = 2 and with with standard sequential Glauber dynamics. 



4.5. Retrieval at nonzero noise levels 



To find the critical noise level (if any) where pattern recall sets in, we return to equation (25), which for 
(r, 6>) = (1 — 7,0) and written in vector notation becomes 



m — ^-(^ tanh(/3c^ • m))^. 



(51) 



We take the inner product on both sides with m and obtain a simple inequality: 



m 



((^ • m) tanh(/3c^ • m))^ 



/37V^((| • mf f dx[l- tanh2(/3cx| • m)])^ 
Jo 

</37V^((|-m)2)^ = /3cm2 (52) 

Since m^(l — /3c) < 0, we are sure that m = for /3c < 1. At /3c = 1 nontrivial solutions of the previously 
studied symmetric type are found to bifurcate continuously from the trivial solution. This can be seen by 
expanding the amplitude equation (|43]) for small m: 

7710 = -— (/c tanh(/3cm0/c))/e 



1 



/3cm0 - -/3^c^m^(^^),/(/) + O(m^) 



(53) 



This shows that the symmetric solutions indeed bifurcate via a second-order transition, at the (/)-independent 
critical temperature Tc = c, with amplitude cx (/3c — 1)2 as /3c 1. All the above predictions 
are confirmed by the results of numerical simulations, and by solving the order parameter equations and 
calculating the free energy numerically, see Figure [8) 
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We can now summarize the phase diagram in terms of the scahng exponents (j^S). The number of 
stored patterns is Nb = aN^, of which K = (j)N^ can be recahed simultaneously, with = min(7, S): 



S <j 

5 = J 

6 > J 



= a, all stored patterns recalled simultaneously, with Curie- Weiss overlap m 
= a, all stored patterns recalled simultaneously, with reduced but finite m 
= 00, at most (j)N^ patterns recalled simultaneously, with ^ 00 and 



5. High storage regime in extremely diluted connectivity: absence of retrieval 

Let us finally consider the same network, composed of Nt T-clones ((7^, z = 1, . . . , Nt) and Nb B-clones (6^, 
/i = 1, . . . , Nb)^ but now at high storage load: 

lim Nb/Nt = a, a>0 (54) 

Nt ^00 



The effective interaction between T-cells is still described by the Hamiltonian (19), and the cytokine variables 
G {0, ±1} are generated from (20), but now we focus on the extremely diluted regime for the B-T network. 



i.e. 7 < 1. Again we must choose r such that the Hamiltonian will be of order Nt- Heuristically, since 
the number of non-zero entries A/'nz in a typical pattern (^f,...,^^^) scales as 0{N^~^)^ the number of 
patterns with non overlapping entries (i.e. those we expect to recall) will scale as 0{Nt /Mm) = 0{N^). 
The contribution from K = 0{N^) such condensed patterns to the Hamiltonian would then scale as 

K Nt 

-He ~ iVf - er<T,)^ ~ N^^KMl ~ N^^N}N^^'-'^'> ~ TV^"^"- (55) 

/i=l i=l 

The non-condensed patterns, of which there are TVnc = Nb — Nc ^ Nb = O(A^t), are expected to contribute 

iVnc Nt 2 

nNC - Y.^Y.^>^)^ - N-^N^.^MZ - N-^NtN].-^ ^ N^-^-\ (56) 

/i=l i—l 

Thus, we expect to have an extensive Hamiltonian for r = 1 — 7. 
5.1. Replica- symmetric theory 

In the scaling regime Nb = aNr we can no longer use saddle-point arguments directly in the calculation of 
the free energy. Instead we calculate the free energy for typical cytokine realizations, i.e. the average 

1 



^logZ^,(/3,0, (57) 



Here indicates averaging over all {^f }, according to the measure (20). The average over cytokine variables 
is done with the replica method, for K = 0{N^); full details are given in Appendix B. We solve the model at 
the replica symmetric (RS) level, which implies the assumption that the system has at most a finite number 
of ergodic sectors for Nt 00, giving 

/^7rs = extrm,g,r /5/Rs(m,^,r), (58) 

Nt ^00 

/3ksim,q,r) = - log2 + ^ar{/3cf{l-q) + - |(^-^|— ^ - \og[l- /3c{l-q)]) 

Jdz log cosh [/3c(m • ^ + zyc^)]^ (59) 

in which m = (mi, . . . , mx) denotes the vector oi K = (j)N^ condensed (i.e. potentially recalled) patterns, 
^ = (^^,...,^^), and Dz = (27r)~^/^e~^ /^dz. As in the analysis of standard Hopfield networks, this 
involves the Edward- Anderson spin-glass order parameter q ^20, ^36j and the Amit-Gutfreund-Sompolinsky 
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uncondensed- noise order parameter r [201 131] • We obtain self-consistent equations for the remaining RS 
order parameters {m^q^r) simply by extremizing /rs(^,^,^), which leads to 

= ^(^^ J tanh[/3c(m • ^^z^/ar)]^^, 
q = (^Jdz tanh^[/3c(m • ^^z^/ar)]^^, 

' =[i-Mi-,)r ^''^ 

As before we deal with the equation for by using the identity tanh(A) = tanh(^^ A) (since 
^ { — 1, 0, 1}) and by separating the term m^^^ from the sum m • ^: 

^/_. , . ,^ 



K 



j tanh [/3c(m^ + X^m^^ + ^V^)] )^ + O^N'^) 



Again we see that for Nt ^ oo we will only retain solutions with G {— m, 0,m} for all ji < K. Given 
the trivial sign and pattern label permutation invariances, we can without loss of generality consider only 
non-negative magnetizations, and look for solutions where = m for /a = 1 < K and zero otherwise. We 
then find 



m 



oo ^ 



k= — oo 



with 7r{k) given in (39). We can now use the manipulations employed in the previous section, to find 
k f 

Dz tanh[/3c(m/c + z^/ar)] 



(61) 

(62) 
(63) 



[l-/?c(l-g)]2- 

The corresponding free energy assumes the form 

/?/Rs(m,9,r) = - log 2 + ^ar{pc)\l-q) + ^^c^cPm^- 



-Pc{l-q) 



I 



Dz logcosh[/3c(m/c + zyc^ 



(64) 



Note that we recover the equations of the medium storage regime simply by putting a = 0. 



5.2. The zero noise limit 

We now show that in the high storage case the system behaves as a spin glass, even in the zero temperature 
limit j3 ^ oo where the retrieval capability should be largest. From (63) we deduce that g ^ 1 in the zero 
noise limit, while the quantity C = /3c(l — q) remains finite. Let us first send /3 ^ oo in equation (62): 



m = 



Dz sgn 



mk - 



Z^foL 



1-C 



mfc(l-C) 



(65) 
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Figure 9. Left panel: Behavior of ar(a) versus a in the spin-glass state (the inset shows only r(a) versus a), 
as calculated from the RS order parameter equations. This shows that r{a) goes to infinity as a approaches 
zero, such that ar(a) remains positive; this means that the noise due to non-condensed patterns can never 
be neglected. Right panel: behavior of the function versus S. Since G(E) < for o; > 0, equation ( |69| ) 

cannot have a solution for o; > 0, and hence no pattern recall is possible even at zero noise. 



with the error integral Erf(x) = {2/^/7^) dt e A second equation for the pair (m, C) fohows from (63): 



C = hm f3c(l- [ Dz tanh^ [/3c(mA: - 



/3^^So dm ( k 



_d 

dm \ k 
an 



Dz tanh 
mk{l-C) 



/3c ( 



mk 



■ z^jar ^ 

Zyjotq 
l-C 



)]), 



(l-C) (exp(- 



2a 



(66) 



We thus have two coupled nonlinear equations ( 65|66 ), for the two zero temperature order parameters m 
and C. They can be further reduced by introducing the variable S = m{l — C)/V2a, with which we obtain 

'\Erf{kE)'j^ (67) 
and rewriting S = m{l — C)/^/2a gives 



m 



C = 1 



^=l-v^s(Ws)\"\ 

m \(p Ik 



(68) 



Using ( 66 ) and excluding the trivial solution !5 = (which always exists, but represents the spin glass state 
without pattern recall) we obtain after some simple algebra just a single equation, to be solved for !5: 



2a = G(S) = 4(^Erf(/cS; 



One easily shows that 



^im G(S) = 0, 



Jim G(S) 



(69) 



(70) 



In fact further analytical and numerical investigation reveals that for S > the function G(E!) is strictly 
negative; see Figure [9] Hence there can be no m 7^ solution for a > 0, so the system cannot recall the 
patterns in the present scaling regime Nb = o^Nt- 
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6. Conclusions 

The immune system is a marvellous complex biological entity, able to execute reliably a number of very 
difficult tasks that allow living beings to survive in competitive interaction with a living environment. To 
accomplish this it relies on a huge ensemble of functions and agents. In particular, the adaptive part of the 
immune system relies on a broad ensemble of cells, e.g. B and T lymphocytes, and of chemical messengers, e.g. 
antibodies and cytokines. As for lymphocytes, one can distinguish between an 'effector branch', consisting 
of B-cells and killer T-cells, and an 'organizational branch', which coordinates the operation of the effector 
branch and consists mainly of helper and regulator T-cells. The latter control the activity of the effector 
branch through a rich and continuous exchange of cytokines, which are specific chemical messengers which 
elicit or suppress effector actions. 

From a theoretical point of view, a fascinating ability of the immune system is its simultaneous 
management, by helpers and suppressors, of several B-clones at once; this is a key ability, as it implies 
the ability to defend the host from simultaneous attacks by several pathogens. Indeed, we investigated this 
ability in the present study, as an emergent, collective, feature of a spin glass model of the immune network, 
that describes the adaptive response performed by B-cells under the coordination of helpers and suppressors. 
In particular, the focus of this paper is on the ability of the T-cells to coordinate an extensive number of 
B-soldiers, by fine-tuning the load of clones and the degree of dilution in the network. However, it is worth 
considering this parallel processing capability also from a slightly different perspective. Beyond the interest 
in multiple clonal expansions (which, in our language, is achieved trough signalling by +1 cytokines), the 
quiescence signals that are sent to the B-clones that are not expanding (which, in our language, is achieved 
trough signalling by —1 cytokines) is fundamental for homeostasis. In fact, B-cells that are not receiving 
a significant amount of signals undergo a depauperation process called "anergy" [30j[31j and eventually 
die. Hence, in the present multitasking network, the capability of signaling simultaneously to all clones is 
fundamental, and with implications beyond solely the management of simultaneous clonal expansions; we 
emphasize that within our approach this is achieved in a rather natural way. 

We first assumed that the number Nb of B-cells scales with the number Nt of T-cells as Nb = aN^^ 
with (5 < 1, and we modeled the interaction between B cells and T cells by means of an extremely diluted 
bipartite spin-glass where the former are addressed only by a subset of T cells whose cardinality scales like 
N^~^ , with 7 < 1. We proved that this system is thermodynamically equivalent to a diluted monopartite 
graph, whose topological properties are shown to depend crucially on the parameters 7 and S. In particular, 
when J > S the graph is fragmented into multiple disconnected components, each forming a clique or a 
collection of cliques typically connected via a bridge. Each clique corresponds to a pattern and this kind 
of arrangement easily allows for the simultaneous recall of multiple patterns. On the other hand, when 
7 < (5, the effective network can exhibit a giant component, which prevents the system from simultaneous 
pattern recall. These results on the topology of the immune network are then approached from a statistical 
mechanics angle: we analyse the operation of the system as an effective equilibrated stochastic process of 
interacting helper cells. We find that for j > S the network is able to retrieve perfectly all the stored patterns 
simultaneously, in perfect agreement with the topology-based prediction. When the load increases, i.e. when 
Nb becomes larger (so the exponent 5 is increased), overlaps among bit entries of the 'cytokine patterns' to 
be recalled become more and more frequent, and this gives rise to a new source of non- Gaussian interference 
noise that is non- negligible for 7 < J. If 7 = ^ the system is still able to retrieve all the patterns, but 
with a decreasing recall overlap. In the high storage case, for J = 1, the network starts to feel also the 
Gaussian noise due to non-condensed patterns, and this is found to destroy the retrieval states. Here the 
system behaves as a spin-glass, from which we deduce that an extremely diluted B-H network (i.e. one with 
7 < 1) is insufficiently diluted to sustain a high pattern load. Our predictions and results are tested against 
numerical simulations wherever possible, and we consistently find perfect agreement. 

Despite the fact that it is experimentally well established that helpers are much more numerous than 
B-cells, their relative sizes are still comparable in a statistical mechanical sense. The biological interest lies 
in the high storage regime, where the maximum number of pathogens can be fought simultaneously. From 
the present study we now know that to bypass the spin-glass structure of the phase space at this load level, 
a projection of the model into a finite-connectivity topology (7 = 1) is required. This, remarkably, is also in 
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agreement with the biological picture of highly-selective touch-interactions among B and T cells. It is both 
welcome and encouraging that both biological data and statistical mechanical theory have now converged 
to the same suggestion: that the most efficient and biologically most plausible operation regime is likely to 
be that of finite connectivity for the effective helper-helper immune network. This must therefore be the 
direction of the next stage of our research programme. 
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Appendix A. Topological properties 

Appendix A.l. Rigorous calculation of link probability 

We consider the bipartite graph S, and denote with pi the number of links stemming from node i G Vr- 
Note that pi also gives the number of non-null entries in the string (^^^, . . . , ^f^^) processed at node that is 

Nb 

All entries are i.i.d. variables ([5|, so for each node the number pi is distributed according to 

When considering two distinct nodes i, j G Vt, the number I of shared nearest-neighbors corresponds to the 
number of non-nuh matchings between the related strings, and this is distributed according to 

-1 



P(£|p„p,-,iVB) 



Nb\ Nb 



(A.3) 



The average {tj p^^p- then follows as 

(^)p.,P, = PiP,/NB. (A.4) 
By further averaging over P{p\Nb-,Nt-,c^^) we get 

{£) = {pf/Ns. (A.5) 

Fluctuations scale as (^^) — (^)^ ~ (^)^, where, from the distribution above, (p) = cNb/2N^. Upon choosing 
Nb = olNt^ we then get {tj ~ A/"^"^^, which vanishes if 27 > 5. Two strings of any two nodes apparently do 
not display significant matching, so there is no link between them, consistent with the results of Section [3j 
On the other hand, \i 2^ < 5 so that {€) ^ 1, we can approximate P{J\Nb, Nt^j^ c) (the probability 
of two randonly drawn nodes in the effective NT-node graph having a link J) with P{J\{i)^ Nt^j^c): the 
probability that a random walk of length Nb with a waiting probability ends at distance J from the 
origin is approximated by the probability that a simple random walk of length {£) ends at the same distance 
(with proper normalization to account for parity features). In particular, 

P{J = 0\Nb,Nt,i,c) « ( ^^^^ )2-<^> « \/2A^ (A.6) 

(using StirUng's formula in the last step). The expected link probability between two nodes follows as 

^7-^/2 

P( J ^ 0\Nb,Nt, 7, c) = 1 - P( J = 0\Nb, Nt, 7, c) « 1 - \ ^ , (A.7) 

V Tra c 

It is easy to see that when 2j = S the link probability is finite and smaller than 1, while when 2j < S it 
converges to 1 in the thermodynamic limit, consistent with the results of Section 3. 



22 



Appendix A. 2. Generating function approach to percolation in the bipartite graph 

Let us consider a bipartite graph S, made of two sets of nodes Vt (of size Nt) and Vb (of size TVb), with 
both sizes diverging. The degree distribution for the two parts are pk and g/e, respectively, with ^j^Pkk = fi 
and ^j^Qkk = i^. Fohowing [39j, we introduce the fohowing generating functions 

Nt Nb 

Mx) = "^Pkx^, go{x) = ^g/ex^, (A.8) 

/c=0 /c=0 

fi{x) = -^fo{x), gi{x) = -^go{x), (A.9) 
/i dx u ax 

We note that fi{x) and gi{x) are the generating functions for the degree distribution of a vertex reached 
fohowing a randomly chosen edge (here the degree does not include the link along which we arrived). One 
always has (i/Nt = v/Nb^ and /o(l) = ^o(l) = /i(l) = ^i(l) = 1 (by construction). 

Next we introduce dilution. We define the matrix t, whose element t^i represents the probability that 
a directed link going from a node in part /c to a node in part ^ exists. For bipartite graphs, t is simply a 
2x2 matrix with zero diagonal entries. We can now write the generating functions for the distributions of 
occupied edges attached to a vertex chosen randomly as follows [39j: 

/o(x|t) = /o(l + (x - l)ti2), /i(x|t) = /i(l + (x - l)ti2), (A.IO) 

qM^) = 9o{l + (^ - 1)^21), 9i{x\t) = gi{l + (x - l)t2i). (A.ll) 

Let us now consider a node i G Vr, with Zi neighbors (where Zi is distributed according to pk). Due to 
the dilution, only a fraction of the links that could connect to i will be present. The nodes in the second 
part that are reached from i will, in turn, have a number of links hitting some nodes in Vt- The generating 
function Fo{x\t) of the distribution of nodes in the first part which are involved in both steps is 



CX) 



Fo(x|t)= ^ EPfc(m)*^2(l-*i2)'"™[5i(a;;t)r 

m=0 k—m 

= /o(5i(a:|t)|t) = /o(l + (5i(x|t) - 1)^12). (A.12) 

In fact, in the expansion of [giix] t)]^, the coefficient of x'^ is simply the probability that m randomly reached 
nodes are connected to a set of n other nodes. If we choose an edge rather than a node we have, analogously 

Fi(x|t) = /i(l + {gi{x\t) - l)ti2), (A.13) 

The corresponding generating functions found upon starting with a note in the part Vb have analogous 
definitions, and will be written as Gq and Gi. 

The generating function Hq for the distribution P{s\t) of the size s of the components (connected sub- 
graphs) which one can detect is i^o(^|t) = Xl5^(5|t)x^. Similarly, i^i(x|t) will be the generating function 
for the size of the cluster of connected vertices that we reach by following a randomly chosen vertex. We 
note that in the highly-diluted regimes we can exploit the fact that the probability of finding closed loops is 
0{N^^) (so Hq and Hi do not include the giant component), which allows us to write the explicit expressions 

Ho{x\t) = xFo{Hi{x\t)\t), Hi{x\t) = xFi{Hi{x\t)\t). (A.14) 

This then gives for the average cluster size: 

is) = Hi^m = 1 + F^mH[m = i - (a.i5) 



where we used -H"((l|t) = 1 + F{{l\t) H{{l\t). As for Fq and Fi, recalling Eqs. (A.12) and (A.13) we get 



F^(l|t) = /o(5i(l|i2i)|ii2)5i(l|i2i) = /o(l|ii2)<?o(l)i2i, and analogous formulae for Fi(l|t). Therefore, 

This expression diverges for 

. . 1 M 17^ 

'''' /((iK(i) (E,i(i-i)P,)(Efcfc(fc-i)*)' 
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signalling the phase transition at which a giant-component first appears. Assuming t to be symmetric, as 
we are interested in equilibrium statistical-mechanical descriptions of the system, the left hand side of the 
previous equations simplifies into t'^ = t^2- for the degree distributions pk and qk, the case considered 
in the main text is the easiest one, viz. an originally fully-connected bipartite network that has been 
progressively diluted, in such a way that pk = Sk,NT-i Qk = ^k,NB-i- Hence, we immediately find 

^ 1/NtNb- (A.18) 

As mentioned earlier, the distribution for the size of the small components reached from a randomly chosen 
node z G Vt has generating function i^o(^|t) = xFi{Hi{x\t)\t). 

The generating function approach implicitly assumes that small components are always tree-like, i.e. 
the probability of finding closed loops in finite components is negligible in the large system size limit. Hence, 
if t = 1 — c/N^, then 7 > 1, so we are in effect considering how the system approaches the percolation 
threshold from the underpercolated regime. Therefore, Hi{x\t) ^ x in such a way that 

Fi{x\t) = {1 - t + t[l + (x - l)tf^-^}^^-\ (A.19) 

Upon substituting ( A.19| ) into the expression for Ho{x\t)^ followed by a numerical inverse Laplace transform. 



we obtain clear evidence that, when s is small, the leading term for P{s\t) decreases exponentially with 
s. This prediction is confirmed by numerical simulations, see the right panel of Fig 4. More generally, the 
distribution of component sizes can be written as 

Nb 

P{s\Ns,Nt,c,^) = (^^) E {^f){l-pr^'''-Hl-pf''^-'^C{l,s), (A.20) 

where we accounted for the probability of choosing a component (C Vr) of size s linked through, overall, / 
nodes (C Vb), and for the probability that this sub-graph is disconnected from the remaining nodes; C{l^s) 
is the probability that a subgraph made of s -\- I elements is connected. Now, the s nodes G Vr can be 
partitioned into sub-graphs {si,52, where includes all nodes linked to exactly k nodes among the 

/ selected. Therefore, we can write 

c{i,s)= E nf'^i '7,' nip'a-p)'"']'^"' (A.21) 

A simple upper bound for C{l^s) follows by imposing that all 5 + / nodes are connected to at least one node 

C{1, s) < C{1, s) = [1 - (1 - p)t [1 - (1 - (A.22) 

This does not imply that the whole sub-graph is connected, but the bound is a good approximation when the 
link probability is either low or high. Using this approximation and the expression p = c/N^, we find that: 
for relatively small 7 only the case s ~ 0{Nt) and / ~ 0{Nb) has non- vanishing probability, for relatively 
large 7 only the case with s and / finite has non- vanishing probability, and for intermediate values of 7 both 
these extreme cases are possible. In Fig. Al we show a comparison between analytical and numerical results. 



Appendix B. Appendix B: free energy evaluation using the replica method 



In this appendix we calculate the free energy per spin of the system characterised by the Hamiltonian (19), 
within the replica-symmetric (RS) ansatz, for the scaling regime Nb = aNr. Let us start by introducing 
the partition function Z^^W^O disorder- averaged free energy /: 

ZM(3,0 = J2ei^''r^AT::f.i^i^^.^. (B.l) 



1 



/ = - lim — log ZnAP,0, (B.2) 



where ••• denotes averaging over the randomly generated {^f}. If we use the replica identity logZ = 
lim^^o log Z^, and separate the contributions from the K condensed patterns from those of the aNr—K 
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Figure Al. Distribution for the cluster size P{s\Nb, Nt, c,j) measured numerically over a simulated 
graph Q with Nt = 10^ nodes. The parameters S = 1, a = 0.5 and c = 1 are kept fixed, while 7 is varied 
(see legend). In the interest of clarity, we plotted the analytical estimate of Eqs. |A.2'ol and [A.22| only for 
7 = 0.78. 



non- condensed ones we get 



/ = — lim lim ^ 



n \r log 

^0 BuNt ^ 



e2 



-log 2- lim lim- log(e2^^ 2^^,=l 2^c.=ll2^i=l^^ ) 



(B.3) 



' t7i,...,t7" 

We compute the non-condensed contributions first, using the standard tool of Gaussian linearisation, and 
the usual short-hands Dz = (27r)~^/^e~^ /^dz and Dz = YTl^i Dzc,: 

Nb-K 



Nt 



Nb-K 



Dz Y[ (l-cNp^cNp cosh{^/pN~^^^"^afza) 

i=l Q!=l 
Nt n 

J Dz e^^^^T^'^ ^«,/3=i EfJi afaf +0(Ar^-^--'^) 



Nb-K 



Nb-K 



(B.4) 



Now it is evident, as in our earlier calculations, that the correct scaling for large Nt requires choosing 
T = 1 — 7. For the correction term in the exponent this gives 0{N^-^^-'') = 0{N:}-'), which is indeed 
vanishing since 7 < 1. We now arrive at 



^cz^ EfJi afaf +o(iv7-^ 



(B.5) 
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We next introduce parameters {qafs} and their conjugates {qafd}^ by inserting partitions of unity: 

Nt 

T — r I I 

1 



271 /Nj 



a(3 i=l a(3 

Substituting ( |B.6| ) into ( |B.5[ ) gives the contribution to the partition function of non-condensed patterns: 



Q!/3 



.i^TE«,/3'?«/3g«/3 + (iVi3-i^)log/DZ e2 



g-i Ei E«,/3 ^*"'?«/3Crf ^ 



The contribution from condensed pattern, see ( |B.3[ ), is 

Dm e 



v/^3iv(.^-^)/^ E^<^ E2=i «r<"»i: 



with m = {m(^} G K"-'^. If we rescale Cy/]3N^ '^^^'^m^ this becomes 



nif/2 



(B.7) 
(B.8) 

(B.9) 



Inserting ( B.7|B.9 ) into (B.3) gives the fohowing expression for the free energy per spin: 
7=-^l°g2-^hm^miog(cW-^) 

Q!/3 



X e L ^ 



xn(e'^ 

i=l 

The number of order parameters being integrated over is of order so corrections to the saddle-point 
contribution wih be of order 0{K\o'gN jN). To proceed via steepest descent we must therefore impose 
K <C A^t/ log Since also the energy term ^a<K should be of order one, as well as the individual 
components of m, the only natural choice is = O(N^). Under this scaling condition we then find 



■■). J 



(B.IO) 



^ log 2 - lim lim -^extrm,g,g/(m, {q, q}) 



(B.ll) 



with 



/(m, {q,q}) = iY^ qa^qafB ^ a\og Bz e'^^"" 

a, (3 



o 2 ^ 

Now we can use the replica symmetry ansatz, and demand that the relevant saddle-point is of the form 



(B.14) 



From now on we will denote m = (m^, . . . , m^) and ^ = (^^, . . . , ^^). After some simple algebra we can 
take the limit n ^ 0, and find that our free energy simplifies to 

(B.15) 



with 



/^/rs = , extr^,g,^ /Rs(m,g,r) 



1 Pc^ 
/Rs(m,g,r) = -log2 + -ar(/3c)^(l-g) + 



Vl-Ml 



f3c{l-q) 



\og[l-pc{l-q)] 



J Dz log cosh[/3c(m • ^ + z\^ar)] 



(B.16) 
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